Syllable-based Machine Transliteration with Extra Phrase Features
Abstract
This paper describes our syllable-based phrase transliteration system for the NEWS 2012 shared task on the English-Chinese track and its back-transliteration. Grapheme-based transliteration maps the character(s) on the source side directly to the target character(s). However, character-based segmentation on the English side causes ambiguity in the alignment step. In this paper we use a phrase-based model to perform machine transliteration, mapping Chinese characters to English syllables rather than to English characters. Two heuristic rule-based syllable segmentation algorithms are applied. The transliteration model also incorporates three phonetic features to enhance its discriminative ability for phrases. The primary system achieved 0.330 on Chinese-English and 0.177 on English-Chinese in terms of top-1 accuracy.
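The abstract does not spell out the two heuristic segmentation rules, so the following is only a minimal Python sketch of what a rule-based English syllable segmenter might look like. The function name `segment_syllables`, the vowel set, and the two splitting rules (V-CV and VC-CV) are assumptions made for illustration and are not the authors' algorithm. The sketch expects purely alphabetic input.

```python
import re
from typing import List

# Assumption: 'y' is treated as a vowel; the paper's actual vowel set is not given.
VOWELS = "aeiouy"


def segment_syllables(word: str) -> List[str]:
    """Split an English word into syllable-like units with two simple rules.

    Rule 1 (V-CV):  a single consonant between vowels starts the next unit.
    Rule 2 (VC-CV): in a longer consonant cluster between vowels, the first
                    consonant closes the current unit and the rest open the next.
    Word-initial and word-final consonants stay attached to the nearest vowel.
    """
    word = word.lower()
    # Break the word into alternating runs of vowels and non-vowels.
    runs = re.findall(rf"[{VOWELS}]+|[^{VOWELS}]+", word)

    syllables: List[str] = []
    current = ""
    for i, run in enumerate(runs):
        if run[0] in VOWELS:
            current += run  # vowel run: nucleus of the current unit
            continue
        has_nucleus = any(ch in VOWELS for ch in current)
        is_last_run = i == len(runs) - 1
        if not has_nucleus or is_last_run:
            current += run  # initial onset or final coda
        elif len(run) == 1:
            syllables.append(current)  # Rule 1: V-CV
            current = run
        else:
            syllables.append(current + run[0])  # Rule 2: VC-CV
            current = run[1:]
    if current:
        syllables.append(current)
    return syllables


if __name__ == "__main__":
    for name in ("clinton", "obama", "smith"):
        print(name, segment_syllables(name))
```

Under these assumed rules, "clinton" is split into "clin" and "ton", so each unit can be aligned with one Chinese character instead of aligning individual letters, which is the ambiguity the abstract attributes to character-based segmentation.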
Similar resources
Phrase-Based Transliteration with Simple Heuristics
This paper models transliteration as a phrase-based machine translation task. We used a popular phrase-based machine translation system for English-Hindi machine transliteration and achieved an accuracy of 38.1% on the test set. We used some basic rules to modulate the existing phrase-based transliteration system. Our experiments show that phrase-based machine translation s...
Syllable-Based Thai-English Machine Transliteration
This article describes the first trial of bidirectional Thai-English machine transliteration applied to the NEWS 2010 transliteration corpus. The system relies on segmenting source-language words into syllable-like units, finding the units' pronunciations, consulting a syllable transliteration table to form target-language word hypotheses, and ranking the hypotheses using syllable n-grams. The app...
English-Hindi Transliteration Using Context-Informed PB-SMT: the DCU System for NEWS 2009
This paper presents English-Hindi transliteration in the NEWS 2009 Machine Transliteration Shared Task, adding source context modeling into state-of-the-art log-linear phrase-based statistical machine translation (PB-SMT). Source context features enable us to exploit source similarity in addition to target similarity, as modelled by the language model. We use a memory-based classification framew...
A Hybrid Approach of English-Hindi Named-entity Transliteration
In recent years, machine transliteration has become a center of research attention. Both machine translation and transliteration are important for e-governance and web-based online multilingual applications. Because machine translation translates the source language into the target language, it produces wrong translations for named entities. Named entities need to be translated while preserving t...
Publication date: 2012